Grammar Sharing Techniques for Rule-based Multilingual NLP Systems

Author

  • Marianne Santaholma
Abstract

Rule-based multilingual natural language processing (NLP) applications, such as machine translation systems, require grammars for multiple languages. Grammar writing, however, is often slow and laborious. In this paper we describe a methodology for multilingual and multipurpose grammar development based on grammar sharing. We present the first step towards a language-independent core grammar for the recognition, analysis and generation of English, Japanese and Finnish in a domain-specific spoken language translation system. The paper focuses on the grammar architecture and rule-writing principles. Evaluation on analysis and generation shows that two thirds of the rules are shared between these three typologically different languages.

Similar resources

Practical Experience With Grammar Sharing In Multilingual NLP

In the Microsoft Natural Language Processing System (MSNLP), grammar sharing between English, French, Spanish, and German has been an important means for speeding up the development time for the latter grammars. Despite significant typological differences between these languages, a mature English grammar was taken as the starting point for each of the other three grammars. In each case, throug...

Fips, A "Deep" Linguistic Multilingual Parser

The development of robust “deep” linguistic parsers is known to be a difficult task. Few such systems can claim to satisfy the needs of large-scale NLP applications in terms of robustness, efficiency, granularity or precision. Adapting such systems to more than one language makes the task even more challenging. This paper describes some of the properties of Fips, a multilingual parsing system t...

Comparing Speech Recognizers Derived from Mono- and Multilingual Grammars

This paper examines the performance of multilingual parameterized grammar rules on speech recognition. We present a performance comparison of two different types of Japanese and English grammar-based speech recognizers. One system is derived from monolingual grammar rules and the other from multilingual parameterized grammar rules. The latter hence uses the same grammar rules for creation o...

Multilingual Grammar Resources in Multilingual Application Development

Grammar development makes up a large part of the multilingual rule-based application development cycle. One way to decrease the required grammar development efforts is to base the systems on multilingual grammar resources. This paper presents a detailed description of a parametrization mechanism used for building multilingual grammar rules. We show how these rules, which had originally been des...

A rule-based Afan Oromo Grammar Checker

Natural language processing (NLP) is a subfield of computer science, with strong connections to artificial intelligence. One area of NLP is concerned with creating proofing systems, such as grammar checkers. A grammar checker determines the syntactic correctness of a sentence and is mostly used in word processors and compilers. For languages, such as Afan Oromo, advanced tools have been lackin...

Publication date: 2007